Inference and optimization of real edges on sparse graphs - a statistical physics 

perspective 
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Inference and optimization of real-value edge variables in sparse graphs are studied using the 
Bethe approximation and replica method of statistical physics. Equilibrium states of general energy 
functions involving a large set of real edge- variables that interact at the network nodes are obtained 
in various cases. When applied to the representative problem of network resource allocation, efficient 
distributed algorithms are also devised. Scaling properties with respect to the network connectivity 
and the resource availability are found, and links to probabilistic Bayesian approximation methods 
are established. Different cost measures are considered and algorithmic solutions in the various cases 
are devised and examined numerically. Simulation results are in full agreement with the theory. 
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I. INTRODUCTION 

The links between statistical physics models and a va- 
riety of inference and optimization problems have been 
significantly strengthened over the last decade Two 
aspects of these links have been exploited. Macroscop- 
ically, using the statistical physics framework, one de- 
scribes typical properties of the problem and provides 
valuable insight into its generic characteristics. Micro- 
scopically, established techniques of statistical physics 
such as the cavity method have been used for devising 
efficient inference algorithms, some of which have been 
independently discovered and used in other research com- 
munities 0,1a II II- 

Most studies so far, both within and outside the sta- 
tistical physics community, have focused on cases of dis- 
crete variables. Among the recently successful exam- 
ples using methods of statistics-based mechanics, one 
can mention hard computational problems H and error- 
correcting codes • Statistical mechanical approaches to 
learning of discrete variables have also been considered 
on tree structures Q. 

On the other hand, networks of continuous variables 
were much less explored. One of the main reasons for 
this limited activity is the difficulty in applying message 
passing approximation algorithms 0, II in this case, as 
the discrete messages passed between variables become 
functions of real variables. Applied message passing for 
systems of real variables typically relies on modelling the 
functions using a reduced number of parameters @ . 

In the statistical physics community there have been 
recent attempts to simplify the messages for continu- 
ous variables. For example, a step forward was made 
in Ref. [10( to parametrize the messages using eigenfunc- 
tion decomposition for special cases. Furthermore, the 
continuous variables treated by these methods are local- 
ized on nodes, whereas many interesting problems, such 
as the resource allocation problem presented here (and 
partially in [ll|, [l2j]) involves real variables defined on 
links between nodes. 



In this paper we study a system with real variables that 
can be mapped onto a sparse graph and suggest an ef- 
ficient message-passing approximation method for infer- 
ence and optimization. We first formulate the problem at 
a general temperature; the message-passing algorithm we 
present here as well as the related analysis are primarily 
general inference algorithms. In this paper, however, we 
are particularly interested in the optimal, zero temper- 
ature, solution that reduces the task to an optimization 
problem. 

Global optimization techniques, such as linear or 
quadratic programming [l3j can successfully solve many 
of these problems. However, message-passing approaches 
have the potential to solve global optimization problems 
via local updates, thereby reducing the growth in compu- 
tational complexity from cubic to linear with the system 
size. An even more important practical advantage is its 
distributive nature that is particularly suitable for dis- 
tributive computation in large or evolving networks and 
does not require a global optimizer. 

We focus on a prototype for optimization, and use the 
example of resource allocation as a vehicle to demon- 
strate the potential of our method, both for gaining in- 
sights into the main properties of the system and as an 
efficient optimization algorithm. Our method is efficient 
since the messages consist of only the first and second 
derivatives of the vertex free energies derived from our 
analysis. The key to the successful simplification, not 
needed for the simpler case of discrete variables, is that 
the messages passed to a target node are accompanied by 
information-provision messages from the target node, to 
first determine the working point at which the derivatives 
should be calculated. 

The problem of resource allocation is a well known 
network problem in the areas of computer science and 
operations management [2 HI. The problem itself 
is quite general and is applicable to typical situations 
where a large number of nodes are required to bal- 
ance loads/resources, such as reducing internet traffic 
congestion and streamlining network flow of commodi- 
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ties [ljl Oj} ■ In computer science, many practical algo- 
rithmic solutions have been proposed to distribute com- 
putational load between computers connected in a net- 
work. They usually are heuristic and focus on practical 
aspects (e.g., communication protocols) . The problem we 
are addressing here is more generic and, in the context of 
computer networks, is represented by nodes of some com- 
putational power that should carry out tasks; sub-tasks 
are then moved around such that all demands will be 
satisfied while the migration of (sub-)tasks is minimized. 

In section [TT| we will introduce the general model, fol- 
lowed by a replica-based analysis in section IIIII and sub- 
sequently by a Bethe approximation-based analysis in 
section IIV1 A message passing algorithm for the prob- 
lem of resource allocation will be presented in section fVl 
followed by the derivation of scaling laws in the limit of 
high connectivity in section lVll Numerical results for sec- 
tions [TTTJVl] will be presented in section lyTIl We will then 
extend the model to the case of general cost functions in 
section IVIII| highlighting strengths and weaknesses of 
our approach. We will conclude the presentation with a 
summary and point to future research directions. 



II. THE MODEL 

The problem we are addressing here is a generic ver- 
sion of resource allocation and serves as an example of a 
sparsely connected system of real variables that should 
be optimized with respect to some general cost. It is 
represented by nodes of some computational power that 
should carry out tasks. Both computational powers and 
tasks will be chosen at random from some arbitrary dis- 
tribution. The nodes are located on a randomly chosen 
sparse network of some connectivity. The goal is to mi- 
grate tasks on the network such that demands will be 
satisfied while the migration of (sub-)tasks is minimized. 
We focus here on the satisfiable case where the total com- 
puting power is greater than the demand, and where the 
number of nodes involved is very large. 

The sparse network considered has N nodes, labelled 
i = 1,...,N. Each node i is randomly connected to 
c other nodes. The connectivity matrix is given by 
Aij = Aji = 1,0 for connected and unconnected node 
pairs respectively. A link variable y^ is defined on each 
connected link from j to i. We focus on the case of inten- 
sive connectivity c ~ O(l) <C N; and restrict the problem 
to the fixed connectivity case although both the analy- 
sis and the algorithm devised on its basis can handle a 
general connectivity profile. 

We consider a general energy function (cost) 

E = ^2A ij 4>{y ij ) +Y^^{K{Vij\Aj = 1}) , 

(»i) 1 

where the summation (ij) is made over all node pairs, 
and A, is a quenched variable defined on node i. In the 
context of probabilistic inference, y^ may represent the 



coupling between observables in nodes j and i, 4>{yij) 
may correspond to the logarithm of the prior distribu- 
tion of yij, and ip(Ai, {yij\Aij — 1}) the logarithm of 
the likelihood of the observables A$. Since the cost is 
independent of the direction of the currents in many ap- 
plications, we focus on the case that </>(y) is a general 
even function of y. In the context of resource allocation, 
yij = —yji may represent the current from node j to i, 
4>{yij) may correspond to the transportation cost, and 
ip(Ai, {yij\Aij — 1}) the performance cost of the alloca- 
tion task on node i, dependent on the node capacity A»; 
the capacity of a node is defined as its computational 
capability minus its computational demand, and is ran- 
domly drawn from a distribution p(Ai). 

III. REPLICA ANALYSIS 

To make the analysis more concrete and strengthen the 
link to the resource allocation problem, we keep the term 
4>{yij) general and, aiming to satisfy the capacity con- 
straints, set rp(Ai,{yij\Aij = 1}) = ln[6 (— J2j Ajl/ij - 
Ai) + e], where e — > and O is the step function. This 
reduces the problem to the load balancing task of min- 
imizing the energy function (cost) E — Y^Uj) •^4j4'(yij)j 
subject to the constraints on the resources of nodes i, 

^AijVij ; + A t > , (1) 

j 

An alternative formulation is to consider the dual of 
the original optimization problem. Introducing Lagrange 
multipliers, the function to be minimized becomes 

L = Y1 Ai i ^ J+E^ 1 E w + A A ■ ( 2 ) 
m i V i J 

Optimizing L with respect to y^ , one obtains 

yij = W\~ l {H ~ Mi). (3) 

where fii is referred to as the chemical potential of node i, 
and (f>' is the derivative of <p with respect to its argument. 
This can be interpreted as the current being driven by 
the potential difference. 

Since the probability of finding loops of finite lengths 
is vanishing in large sparse networks, the structure of a 
sparse network is locally a tree. Thus, given a configura- 
tion of currents {yij}, one can set the current potential 
Vi of a node to an arbitrary value, and assign Vj of its 
neighbors according to Vj = v L + y^ . Repeating this as- 
signment process to next nearest neighbors and so on, the 
current potentials of all nodes in the tree can be deter- 
mined. Hence, the current potentials can be considered 
as valid independent variables as the current variables 
used originally. This implies that we can consider the 
optimization problem in the space of the current poten- 
tials. Since the energy function is invariant under the 
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addition of an arbitrary global constant to the current 
potentials of all nodes, we introduce an extra regulariza- 
tion term e J2i Mi /2 to break the translational symmetry, 
where e — > 0. (Note that the current potentials v are 
different from the chemical potentials which are the 
Lagrange multipliers of the dual formulation in Eq. ([2|). 
Only for the quadratic cost <p(y) — y 2 jl can the cur- 
rent be expressed in terms of the difference in chemical 
potentials. Even in this case, the two potentials may dif- 
fer by a non-vanishing constant since the resource con- 
straints in Eq. |T]) imply that the maximum of the La- 
grange multipliers is 0, whereas the current potentials 
minimize ^ 2 /2 and are unlikely to have a maximum 
value of 0.) The corresponding partition function is 



z =n /^n e 



x exp 



(4) 



The replicated partition function [l| , at a temperature 
T = averaged over all network configurations of 

connectivity c and capacity distributions /o(Aj), is given 
by 



(z n : 



A,A : 



h E nHEA 



A,-0,1 i 




exp I -H X Aj^iy" - «?) - % J>?) 2 ) -(5) 



Here Af = Y^Ai =o l Ili -Aij — c) is the total number 
of gra phs with connectivity c. This can be easily shown 
to be tH Af = exp {TV [-(c/2) + (c/2) In(dV) - lnc!]}. 

The interaction coupling current potentials of differ- 
ent nodes makes it difficult to decouple them in order 
to define macroscopic order parameters. Nevertheless, 
additional expansions detailed in Appendix A also show 
that it is possible to dientangle neighboring node indices. 
(This justifies the formulation of the optimization in the 
space of the current potentials {i/j} rather than that of 
the currents {yij}-) This leads to the following definition 
of the order parameters 

Q r , s = -L=£z. t exp fe^finH^r^) 8 ", 

* i \ a /a 

and its conjugate Q r s . Following the analysis of [l8| and 



averaging over the connectivity tensor A one finds 
+ ln J dAp(A)H (j dv a J™d\ a J ^) 



x exp 



^ ( iX a (X a + CV a ) - ^-{v a ) 2 



X' 



where 



(7) 



(8) 



-iXr 



d 

dy 



The somewhat unusual indices of the order parameters 
Q r ,s and Q r , s , the vectors r and s, represent n-component 
vectors (r\, . . . , r n ) and (si, . . . , s„) respectively. This is 
a result of the specific interaction considered which en- 
tangles nodes of different indices. The order parameters 
Q r ,s and <5 r s are given by the extremum condition of 
Eq. (|7J), i.e., via a set of saddle point equations with re- 
spect to the order parameters. To facilitate the solution, 
we introduce the generating function of P s (z) and its in- 
version, 



p s ( z )=^g r , s ]J 



(9) 



Eliminating Q r ,s, and substituting the saddle point equa- 
tion of Q rs into P s ( z ) in Eq. ([9]), one finds the recursion 
relation 

P s (z) = ±-j dkp(h) Y\ J dv a d\ a J ^ 

f3e 



x exp ^iA Q (A Q + cv a — z a ) 

><n 



En 

Sfe fc=l 



la S k' 



-iXr 



-P<Kv) 



(10) 



where Dp is a constant given in Eq. (|A9|) . Note that 
P s (z) is expressed in terms of c— 1 functions P Sk (u) (k = 
1, ..,c— 1), integrated over v and summed over s&. This 
structure is typical of the Bethe lattice description of 
networks of connectivity c, explained in Section llVl where 
nodes are divided into generations. Each node provides 
input to an ancestor node and receives input from c — 1 
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descendent nodes. This forms a tree structure, in which 
the state of a node depends on those of its subsequent 
generations. 

In order to derive a set of recursive equations one 
should make an assumption about the inherent symme- 
tries of the problem. Here we employ the replica sym- 
metric ansatz. In previous treatment of related problems, 
the order parameters are represented as an integral over 
moments of the corresponding probability distribution, 
incorporating the permutation invariance of the replica 
indices Generalizing to the case of P s (z), which 

is an order parameter depending on the continuous vari- 
ables z, the ansatz takes the form 



P s (z) = (T[ Q dv R(z a ,v\T)v 




(11) 



where T represents the tree terminated at the vertex 
node with current potential v, providing input to the an- 
cestor with chemical potential z, and (. . .)a represents 
the average of the capacities of each node of the tree over 
the distribution p(A). Note that the replicas are coupled 
through their common dependence on the quenched vari- 
ables A. This is in contrast to conventional derivations, 
such as the SK model [l|, in which the dependence on 
the disorder is integrated out, leading to more explicit 
inter-replica dependencies. 

The resultant recursion relation for the function R is 
independent of the replica indices, and hence remains 
valid in the n — > limit. It is given by 



1 c_1 T f 

R(z, i/|T) = — Y[ / du k R{u, v k \T k 

R k=l 

X0 I y"Vfc -CV + Z + Ay( T ) 



\k=l 



x exp 



/3e 



k=l 



(12) 



where Dr is a constant given in Eq. (jAlOj) . and Tfc rep- 
resents the tree terminated at the fcth descendent of the 
vertex, Ayi?) is the capacity of the vertex of the tree 



T. Eq. (fT2|) expresses R(z, v\T) in terms of c— 1 func- 
tions R{v, ffc|T/-) (k — l,..,c— 1), integrated over v k . 
Again, this is characteristic of the Bethc lattice struc- 
ture. Furthermore, except for the factor exp(— /3ei/ 2 /2), 
a self-consistent solution of R is that it is a function of 
y = h>—z, which is interpreted as the current drawn from a 
node with current potential v by its ancestor with current 
potential z. Hence we will express the function R as the 
product of a vertex 'partition function Zy and a normal- 
ization factor W , which contains any residual dependence 
on v. Since e is taken to approach zero in the analysis, 
it is expected that W should approach a constant inde- 
pendent of v. Hence we let R{z, v\T) = W(v)Z v (y\T). 
As explained in Appendix in the limit e — > 0, the 
dependence on v and y decouples; this enables one to 



derive a recursion relation for the vertex free energy [191 ] 
Fy(y\T) = —Tin Zy(y\T) when a current y is drawn from 
the vertex of a tree T [2fj| , 



Fy(y\T) = -ThJl[ (| dy k ^j efc. 



+Ay (T ) exp 



fe=l 

c-1 



,yk-y 



x exp 



P^2(F v (y k \T k ) + ( f>(y k )) 
fc=i 

Tln |n(/ dy^eN^yk+A 

c 

f3^2(F v (y k \T k ) + ^(y k )) 




(13) 



In the zero temperature limit, this recursion relation re- 
duces to 



F v (y\T) 



telEt 1 1 »-!'+ A v(T)>o} 



^(Fv(y fe |T fe ) + 0(y fc )) 



k=l 



mm 

Avk\T.'LiVk+^v>o} 



J2(Fv(y k \T k ) + <b(y k )) 



Lk=l 




The solution of Eqs. (fl3|) or (fT4|) can be obtained numer- 
ically in a recursive manner, since the vertex free energy 
of a node depends on its own capacity and the disordered 
configuration of its descendents. 

Using the replica approach, and following the deriva- 
tion of Appendix [3 the averaged free energy of the net- 
work is given by 



(F) A = -N(Tlnl^f[ y dy k ^j 9 (f^ 



Vk + M 



x exp 



-/^(iV(y fe |T fe ) + 0(y fc )) 



(15) 



The current distribution and the average energy per 
link can be derived, using the calculated vertex free en- 
ergy, by integrating the current y' in a link from one 
vertex to another, fed by the trees Ti and T2, respec- 
tively; the obtained expressions are P{y) — (S(y — J/'))* 
and (<j)) = (<j)(jj')) i , where 



<•>* = 



/ri^expH^fa'lTi.Ta)] (.) 
f dy'e^>[-PF L (y'\T u T 2 )} 



(16) 



with 



F L {y'\T u T 2 ) = Fviy'lTi) + F v (~y'\T 2 ) + cP(y'). (17) 
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IV. RECURSION RELATION AND FREE 
ENERGY IN THE BETHE APPROACH 

The results in Section HTT1 can be interpreted using the 
Bethe approach. Since the connectivity c is low, the prob- 
ability of finding a loop of finite length on the graph is 
low, and the Bethe approximation well describes the local 
environment of a node. In this approximation, a node is 
connected to c branches in a tree structure, and the cor- 
relations among the branches of the tree are neglected. 
In each branch, nodes are arranged in generations. A 
node is connected to an ancestor node of the previous 
generation, and another c — 1 descendent nodes of the 
next generation. 

Consider a vertex V(T) of capacity Ay( T ), and a cur- 
rent y is drawn from the vertex. At a temperature 
T = /3 _1 , one can write an expression for the free en- 
ergy F(y\T) as a function of the free energies F(yk\T k ) 
of its descendants, that branch out from this vertex 

F(y\T) = - Tln |ll (/ 9 (X>-y+ A ^ 



x exp 



-pJ2(Myk\T k ) + <f>(y k )) 



k=l 



(18) 



where T k represents the tree terminated at the k th de- 
scendent of the vertex. The free energy can be considered 
as the sum of two parts 

F(y\T)=N T F 3 , v +F v (y\T), 

where AT T is the number of nodes in the tree T, F av is the 
average free energy per node, and Fv(y\T) is the vertex 
free energy. 

This allows one to decompose the free energy into 

/c-l \ c-l 

I>t» + 1 }F av + F v (y\T) = J2 N Tk F av 



\fc=i 



k=l 



y k -y + A y(T) 



x exp 



c-l 



-pJ2(F v (y k \T k )+4>(y k )) 



k=l 



(19) 



To determine the F av , we consider the effects of adding 
a vertex V which is fed by c individual trees T 1; . . . , T c . 
The total free energy is then 

(j2 N T k + IJ Fav = -( Tln {n (/ d »*) 



e yk + Ay exp 



\k=l 



k=l 



pJ2[N Tk F av +F v (y k \T k 



+<t>{Vk) 



(20) 



Rearranging the terms one obtains a recursion relation 
identical to Eq. (|13p . The average free energy per node 
is given by the second term of Eq. (fT3| , and has the same 
expression for the free energy as in the replica approach 

(US}. 

The recursion relation can also be recast into a form 
reminiscent of those commonly appearing in Bethe lat- 
tices of Ising spin variables, such as in Refs. [g, LZI, Il8j ]. 
This is achieved by considering the probability distribu- 
tion of vertex free energies P[-FV]- Using Eq. (fT4|) . 

c— 1 

P[F V ] = JdA vP (A v )Y, J-DF Vk P[F Vk ] 



fe=i ' 



n^-Thr|n(/^)e(£ : 



Vk-V +Ay 



x exp 



c-l 



-0^2(F vk (y k )+cl>(y k )) 



k=l 



-(F) A -F v (y) 



(21) 



Comparing with Bethe lattices of Ising variables, the ver- 
tex free energy Fy plays the role of a cavity field. The 
difference here is that the distribution to be iterated is no 
longer a function of a single cavity variable. Rather, the 
distribution is defined in the space of cavity free energy 
functions, since we are dealing with continuous variables. 
This parallelism enables us to solve the recursion relation 
by population dynamics. At each step of this approach, a 
new ancestor node is generated at random, and its vertex 
free energy is updated. 

It is interesting to point out that the iterative equa- 
tions (fl3|) can be directly linked to those obtained from a 
principled Bayesian approximation, where the logarithms 
of the messages passed between nodes are proportional 
to the vertex free energies. This is shown explicitly in 
Appendix [5] 



V. THE MESSAGE-PASSING ALGORITHM 

The local nature of the recursion relation (TT3|) points 
to the possibility that the network optimization can be 
solved by message passing approaches, However, in con- 
trast to other message passing algorithms which pass con- 
ditional probability estimates of discrete values to neigh- 
boring nodes, the messages in the present context are 
more complex, since they are functions Fv(y\T) of the 
current y. 

The derivation of the algorithm can be viewed as a 
minimization of the cost function with respect to current 
changes under the capacity constraint at the neighbor- 
ing nodes. When the cost is quadratic, the impact of 
current changes can be described through the first and 
second derivatives with respect to the vertex free energy. 
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As will be explained at the end of this section, this two- 
component message is sufficient to provide the exact so- 
lution, as long as the algorithm converges. 

We follow this route and simplify the message to 
two parameters, namely, the first and second deriva- 
tives of the vertex free energies. Let (Ay , .By ) = 
(dF v (y ij \T j )/dy, J ,d 2 F v {y lj \T j )/dyf 3 ) be the message 
passed from node j to i. Based on the messages received 
from the descendents k ^ i, the vertex free energy from 
j to i can be obtained by minimizing the free energy in 
the space of the current adjustments Ejk drawn from the 
descendents. Using Eq. (|14p . we minimize 



k^i 



1 



1 



subject to the constraint 



(22) 



E AMyjk + £jk) ~ Vij + A, > 0, (23) 

where <//- fe and 0" fc represent the first and second deriva- 
tives of </>(y) at y — yjk respectively. Introducing the 
Lagrange multiplier /zy , the optimal solution is given by 

F *j = \ E A * K- - (A/fc + <t>'jkf] (Bjk+^k)- 1 (24) 



k=£i 



where 



+Aj - y tj 



A 3k [y 3 k - (A jk + <t>' jk ){B jk + <(>%) x ] 

.k^i 

-1 — 1 s 



,0 



(25) 



The first and second derivatives of Fh with respect to yy 
lead to the forward message (Aij , -By) from node j to i, 



As 



B; 



S(-Mij + e ) 



Ek^^Mk + ^k)- 1 



(26) 



jk) 



We note in passing that when the descendent currents 
yjk change continuously, both the vertex free energy |24|) 
and the chemical potential (|2"5")l change continuously for 
functions 4>(y) with continuous first derivatives. Hence 
for the quadratic load balancing task, defined by cj)(y) = 
y 2 /2, a self-consistent solution of the recursion relation 
Eq. (TTJ} consists of vertex free energies which are piece- 
wise quadratic with continuous slopes. This makes the 
2-parameter message a very precise approximation. 

In principle, if the forward messages consist of the full 
vertex free energy functions, then they are already suffi- 
cient for the optimization task. However, since the mes- 
sages are simplified to be the first and second derivatives 
of the vertex free energies, each node needs to estimate 
the optimal currents by approximating the vertex free 



energy function as a quadratic function. Hence, the re- 
maining step of the algorithm is the determination of the 
drawn current yy at which the derivatives comprising 
the messages should be computed. This determination 
of the working point is achieved by passing additional 
information-provision messages among the nodes, a step 
not present in conventional message-passing algorithms. 
The following two methods are proposed for this purpose. 

In the first method, when messages are sent from node 
j to ancestor node i, backward messages yjk computed 
from the same optimization steps are sent from node j 
to the descendent nodes k ^ i, informing them of the 
particular arguments to be used for calculating subse- 
quent messages. From Eqs. (|22l) and (|23|) . this backward 
message is given by 



Vjk^Vjk 



Ajk + <j>' jk + (Hj 
Bjk + <t>% 



(27) 



In the second method, node j first receives the mes- 
sages (Aji, Bji) and current yji from the ancestor node i, 
and update the current from yy to yy +£y by minimizing 
the total cost 



— A^eij + ^BijE i: j + Aji( y^j e^ yji) 



hj £ ij 



(28) 



In Eq. ([28]) . the first two terms represent the message 
from i to j, the next two terms from j to i, and the 
last two terms the transportation cost in link (ij). The 
optimal solution is 

BijVij - Ay - Bjiyji + Aji - ftj + ^yij 
Vok< „ ~5 — " 1 — ■ ( 29 ) 



B, 



Bu 



Both methods work well for the quadratic cost function. 

Implicit in the information-provision messages is the 
independent update of the currents yy and y^ in the 
opposite directions of the same link. This allows us to use 
the criterion yy = —yji as a check for the convergence 
of the algorithm. We have used this in our simulation 
program by requiring the root mean square average of 
yy + yji to be less than a threshold. Another usage of 
the information-provision messages is in monitoring the 
optimal cost function during simulations. This saves the 
extra step of calculating the current associated with a 
link in the conventional Bethe approach. 

An alternative distributed algorithm can be obtained 
by iterating the chemical potentials of the nodes. 
The Kiihn- Tucker condition requires that the terms 
Mi(Xy Aij + Aj) in Eq. ([2]) vanish. Eliminating yy 
in terms of the chemical potentials, fii can be expressed 
in terms of [ij of its neighbors, namely, 

Hi = min^r^o^O); gi(x) = 22A i j[(p']~ 1 (iJ,j-x)+A i . 

(30) 
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For the quadratic load balancing task, <p(y) = y 2 /2 and 



VI. THE HIGH CONNECTIVITY LIMIT 



ijfij +Ai ,0 



(31) 



This provides a local iteration method for the optimiza- 
tion problem. We may interpret this algorithm as a 
price iteration scheme, by noting that the Lagrangian 
in Eq. ([2]) can be written as 



L 



(ij) 



constant, 



(32) 



where 



Lij = <t>(Vij) + (Mi - ^j)Vi. 



(33) 



Therefore, the problem can be decomposed into indepen- 
dent optimization problems, each for a current on a link. 
fXi is the storage price at node i, and each subproblem 
involves balancing the transportation cost on the link, 
and the storage cost at node i less that at node j, yield- 
ing the optimal solution. This provides a pricing scheme 
for the individual links to optimize, which simultaneously 
optimizes the global performance [221 ]. 

It can be easily verified that the message-passing al- 
gorithm, in the two-parameter approximation, yield so- 
lutions identical to the price iteration algorithm, which 
is exact, as long as the connectivity is sparse and the al- 
gorithms converge. Indeed, simulations provided in sec- 
tion IVIII show that the two algorithms yield excellent 
agreement with each other. 

One can proceed with the verification by noting from 
Eq. (|2T|) that Hij — —<P'jk~A.jk for all k ^ i at the steady 
state. Since foj is independent of the node i receiving 
the message, one can write /Xjj as fij. Similarly, using 
Eq. Ajk = -fJ- 3 k = —Hk- We then have 4>' jk = 

Mfc — Mji whose inverse relation is Eq. ©. The resource 
constraint Eq. (JTJ) then leads to Eq. 

The result that the first order message converges to 
the exact result of the chemical potential at the steady 
state justifies the simplification of the message to two 
parameters. It illustrates that higher order messages are 
not required for the precision of the optimal solution, as 
long as the algorithm converges. This is natural for the 
quadratic cost, for which it can be verified that the vertex 
free energies are piecewise quadratic functions of the cur- 
rents with continuous slopes. In addition, exact solutions 
can be found for other cost functions, as described in Sec- 
tion lVIlfl Though the second order messages do not play 
a role in the final solution, they are useful in tuning the 
intermediate steps for faster convergence. The situation 
is reminiscent of the use of both gradient and curvature 
information in many gradient-based optimization tech- 
niques. 



Both the recursive (JT4J) and message-passing equa- 
tions (|25p - (j2"6")) can be solved numerically as will be shown 
in the next section. However, scaling laws of the quan- 
tities of interest can also be derived analytically in the 
limit of high connectivity. 

We restrict the analysis to the case of quadratic cost 
function cj)(y) — y 2 /2. In the limit of large c, Eq. 
converges to the result 



Bij - 



S(-My) 



(34) 



and the currents scale as c _1 . Therefore, the task of sat- 
isfying the capacity constraints is shared by a fraction of 
0(1) of the descendents. As a result, the collective effects 
of the descendents on a node can be expressed in terms of 
the statistical properties of the descendents. Using this 
scaling property of the currents, Eq. (|2"6")1 reduces to 



Aij = max | — 



(35) 



By virtue of the law of large numbers, it is sufficient 
to consider the mean tua and variance a\ of the mes- 
sages A^. Respectively, they scale as c _1 and c -2 with 
J2k^i AjkAjk being self-averaging. Hence, we can write 

A^ — —(crriA — Aj)9(c77i J 4 — Aj). (36) 

Averaging over A, drawn from a Gaussian of mean (A) 
and variance 1 used in our numerical studies, one obtains 
a self-consistent expression for the parameter £ = cm a — 
(A) 



(A) 



/2tt 



mo- (37) 



A. Current distribution 

To obtain the current distribution, one considers the 
vertex free energies of both ends of a link. For a current 
yij flowing from j to i, the total energy is given by 



E = A 



ij Vij 



Q(~M»j) 2 

2c Vij 



A, 



Vij 



2c 



■vl 



1 2 

2 ' 
(38) 

where we have approximated the working points of the 
messages to be ytj = 0. This is justified since the mag- 
nitudes of the messages are 0(c _1 ), and g/y ~ cr 1 . Min- 
imizing the energy with respect to the current j/y, one 
finds 



1 



V>j 



- [{cm A - Aj) 0(cm A - Aj) 



-(cm a — Aj)Q(cmA — Aj)] . 



(39) 
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Hence, the current distribution is given by 

' 1 



P(y) = / dA lP (A l ) / dA 2 p(A 2 )8 



(cm a - Ai) 



x9(cTOyi — Ai) — (cm a — A 2 )0(cmA — A2) 



(40) 



For the Gaussian distribution of capacities, one obtains 
exp(-^£) fey -2^ 

P(y) = 2 v = ' rr 1 J 



exp 



(cy-O 2 



+ H(0 2 S(y). (41) 



This shows that the distribution P(cy)/c, obtained by 
rescaling the argument by c _1 , is independent of c, and 
depends solely on the average capacity (A) through £. In 
particular, the fraction of idle links is given by 



P(y = 0) = H(Cf 



(42) 



The physical picture of this scaling behavior is that the 
total current required by a node to satisfy its capacity 
constraint is shared by the links. 



B. Average energy 

Using Eq. (f39|) . the average energy per link can be 
written as 

(4>) = ^{(N - A) 2 e(c77M - A)) 



((cm a - A)@(cm A - A))' 



For the Gaussian capacity distribution, it becomes 
1 



(^ = -[m)~h(m 



(43) 



(44) 



where 



h(0= Dz(£-z) = 



_£1 
e 2 



+ &{-£), (45) 



J 2 (0= / Dz(t-zf=Z^L + (e + l)H(-0. (46) 

J-oo V 27T 

We are also interested in how the energy is distributed 
in the network. Consider the average energy per link 
(0| A) among those links connected to nodes of capacity 
A. Using Eq. ([55]). 



<0|A> 



dA 2 p(A 2 ) 



(cmA — A)0(cm J 4 — A) 
2 



(cmA — A 2 )Q(cm A — A 2 ) 



(47) 



For the Gaussian capacity distribution, this becomes 



1 



(0|A) = [i 2 (0 - (h(0 2 - a 2 ) e (h(0 - A)] 



(48) 



C. Chemical potential distribution 



To obtain the distribution of the chemical potentials, 
one follows a similar treatment and considers a central 
node fed by c descendents. Introducing a Lagrange 
multiplier to enforce the capacity constraint, one replaces 
the energy minimization problem by the Lagrangian 



A ojVo 3 + —„ — Vo. 



1 



2c 



1, ■ 2 V °i 



(49) 



The currents are given by j/oj — — ^oj — Mi an< 4 the chem- 
ical potential by 



fx = mm l — 



Ao-^A 0j 



,0 



In the large c limit, the approximated expression for /1 
becomes 

(j, = - (A - cm A ) 6 (cm^ - A ) , (50) 
c 

and the chemical potential distribution is similarly de- 
rived 



dA p(A)6 



(A — cmA) — M 



dA p(A)S(p). 



(51) 



For the Gaussian capacity distribution, it reduces to 



P( M ) = 



exp i 



V2^A 



6(- M ) + (52) 



This shows that the distribution P(cp)/c, obtained by 
rescaling the argument by c _1 , is independent of c, and 
depends solely on the average capacity (A) through £. In 
particular, the fraction of unsaturated nodes is given by 



P(ji - 0) = H(Z). 



D. Resource Distribution 



(53) 



We define the resource at a node i by 



(54) 
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The currents are obtained by minimizing 



A ., | e(-w) . 



(55) 



subject to the constraints Ylj -^ijVij + A* > 0- Intro- 
ducing the Lagrange multiplier /ij for the constraint, we 
obtain 

r, = max[A; — cm^, 0]. (56) 
Hence, the resource distribution is given by 



P(r) 



dA p(A)S(A - cm A -r) + 



dA p{A)6(r). 
(57) 

For the Gaussian capacity distribution, it reduces to 

= exp[-l(r^-0 2 ] e(r) + H( _^ (r) . (58) 



V2tt 

This shows that the resource distribution becomes inde- 
pendent of c in the large c limit, confirming the picture 
that the current in a link scales as c _1 , summing up to a 
total current of c° satisfying the resource requirement of 
the nodes. 



E. Dynamics 

To analyze the dynamics in the limit of large c, one 
considers random sequential updates using the algorithm 
presented in section [Vl Time is divided into steps of size 
At = 1/cN. At each time step, a directed link from node 
j to i is randomly chosen, such that each directed link is 
chosen exactly once in each integer interval of time, and 
the messages of the links are updated. 

The current yjt, for a link feeding node j, is updated 
in the backward messages corresponding to the forward 
ones from j to i ^ k. (This implies that yj k is updated 
K times in a time step. As will be shown, the algorithm 
uses information updated in the previous step to compute 
the optimal current. Since the previous step lies in the 
previous interval, this approach is not the most efficient 
for monitoring the evolving average energy.) 

Denote the average of message Ajk(t) over all links at 
time t as m^(i), and m,A(t) the expected value of Ajk(t) 
when it is updated at time t. Then, for a time i in the 
interval between to and to + 1 this leads to the dynamical 
equation 



dt 



= rh A (t) - m A (to) for t < t < t Q + 1 . (59) 



Suppose the link ij is updated at time t, according to 
Eq. (J2U). Then the average over link ij becomes 



cm A (t) 



cm A (t) 



dA p(A) (cm A {t) - A) . (60) 



For the Gaussian capacity distribution, this becomes 
r€(t) 



crh A (t) 



Dz (£(t) - z) = h(t(t)) 



(61) 



where £(t) = cm,A{t) ~ (A). It is convenient to convert 
Eq. ([59l to a dynamical equation for £(f) 



dm 

dt 



= h(£(t)) - £(i ) - (A) for t < t < t + 1 , 



with the initial condition £(0) = —(A). 

The dynamics of the average energy depends 
on whether one adopts the backward or forward 
information-provision method, described by Eqs. (|27p 
and respectively. We first consider the case of back- 
ward information-provision. Suppose the link from j to 
i is updated at time Uj in the interval between to and 
to + 1. Using Eq. (|2?J). 



Vjk{ti. 



~Ajk(t jk ) — VijiUj), 



(62) 



where Ajk(t~ k ) is given by Eq. (|36p and tj k is the instant 
that the link from k to j is previously chosen for update. 
With probability to + 1 — Uj , tj k lies in the previous time 
interval between to — 1 and to- Otherwise, t~ h lies between 
to and tij. 

To calculate the average energy (4>(to + 1)), one can ex- 
press (y? k ) in terms of the moments (A? k (tJ k )), (t^jiUj)} 
and (Ajk(t~ k )nij(tij)) . Hence, on averaging over all de- 
scendents k, denoted as ( ) k , the second moment of 
Ajk(tJ k ) is given via Eq. |[3B]) by 



A jk (t jk y 



dtj k (t + l-t ij )+ [" dtj k 

to-1 Jt 
2 



(63) 



x / dA p(A) cm A (t jk ) - A 
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cm A (t jk ) 



A 



For the Gaussian capacity distribution, this becomes 

1 



(64) 



where 



dt'(t + l-t)+ / dt' 
to-l Jto 



hW))- (65) 



Averaging over (ij) at time to + 1, 
which can be simplified to 



dt J 2 (t), 



(66) 



in 



ij k 



dt J 2 (C(t)) 



to-1 



io + 1 



dt (t + l-t)/ 2 (£(t)) 



(67) 
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A similar calculation follows for the second moment of 
Hij (tij ) at time to + 1, leading to 



dt J 2 (£(t)), 



(68) 



in 



for the Gaussian capacity distribution. 

Similarly, the expression for the crossed moment in the 
case of Gaussian capacity distribution is 



1 



io + l 



dt h(t)h(Z(t)), (69) 



to 



which can be simplified to 



t +l 



dt h(£(t)) 



t -l 



dt (t Q + l-t)h(Z(t)) 







io + l 



to 



(70) 



Hence the average energy per link in the case of Gaussian 
capacity distribution is 



Wo + 1)> 

fio + l 




dt i 2 (£(t)) 



t -l 



dt (t + l-t)J 2 (£(t))-2 



to+1 



(t + l-t)/i(£(t)) 



IV to 
io + l 



to-1 
io + l 



dt I 2 (Z(t)) 



dtlMt)) 

(71) 



to 



Next, we consider the case of forward information- 
provision described by Eq. (|2"9")) . whose large c limit is 
given by 



Uij(tij) — AijfaA + Ajiybji), 



(72) 



where the link from j to i is updated at time tij in the 
time interval between to an d to + 1, t~ and t~ are respec- 
tively the instants that the links from j to z and from i to 
j are previously chosen for update. Derivation analogous 
to the backward information-provision method yields 



dt 







■ pto+1 


/•to+1 ~) 






/ dtJi(t) 


+ dt J 2 (t) L (73) 






-Jt 





which can be simplified to 
1 f 3 r to 



Wo + 1)> 



2c 2 1 2 



dt J 2 (£(t)) 



(74) 



to + 1 



dt (t + l-t)/ 2 (^)) 



to 



io+l 



dt (t + 1 -t)h{t) 



dthm)) 

Uto-l 
to - 2 
dt J a (f(t)) 



Eq. (|72j) shows that the forward information-provision 
method uses only outdated information to calculate the 
current. Consequently, the convergence of the average 
energy is slower than that of the backward information- 
provision method by about half a step. 



VII. NUMERICAL RESULTS 

To examine the accuracy of the theoretical results and 
the efficacy of the message passing algorithm of section fVl 
we conducted a scries of numerical experiments. First, 
we solved numerically the recursive equation (|14[) using 
population dynamics for various connectivity values and 
obtained various quantities of interest from it, including 
the energy, current and chemical potential distribution. 
Second, we carried out simulations using the algorithms 
of Eqs. (|26|) and (j25|) and compared the results to those 
obtained from the numerical simulations. We then com- 
pared the scaling properties of the results with respect 
to the connectivity with the theoretical scaling obtained 
in section ["VTl All experiments in this section have been 
carried out using the quadratic cost <fi(y) — y 2 /2. 

To solve numerically the recursive equation f) 14[) we 
have discretized the vertex free energy functions i*V(y|T) 
into a vector, whose i th component is the value of the 
function corresponding to the current j/j. To speed up 
the optimization search at each node, we first find the 
vertex saturation current drawn from a node such that: 
(a) the current drawn by each descendent node separately 
optimizes its own vertex free energy plus the transporta- 
tion cost to the node being considered. For descendent 
nodes fc, this current yl is given by 



y% = argmin [F v (y k \T k ) + cf)(y k 



(75) 



(b) The resource of the node considered is just used up. 
For node j, its vertex saturation current y| is given by 



(76) 



For current below this saturation point, the vertex 
free energy remains constant. That is, Fv(yj\Tj) — 
Fv(yj\Tj) for yj < yj. Hence, this provides a conve- 
nient starting point for searching the optimal solutions. 
The drawn current can then be increased in small dis- 
crete steps, and the optimal solution can be found, for 
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example, using an exhaustive search, by varying the de- 
scendent currents in small discrete steps. This approach 
is particularly convenient for c = 3, where the search 
is confined to a single parameter. For larger values of 
c, other search techniques, such as conjugate gradient 
search, can be used. 

These recursive equations provide a discretized repre- 
sentation of the vertex free energy Fv(y\T), from which 
various properties of the system can be calculated. 

Average Energy - To compute the average energy, we 
randomly draw c — 1 nodes, compute the optimal current 
flowing between them, and repeat the sampling to obtain 
the average. 

The results of iteration for a Gaussian distribution 
p(A ) with variance 1 and average (A) were described in 
[12|. There we found that the convergence rate slows 
down when (A) decreases towards 0. A cusp in the re- 
laxation rate dependence on (A) exists at (A) « 0.45, 
where the fraction of saturated nodes is about 0.48, close 
to the percolation threshold of 0.5 for c = 3. Hence the 
cusp is probably related to the appearance of a percolat- 
ing cluster of negative resources, which draws currents 
from increasingly extensive regions of nodes with excess 
resources to satisfy the demand. 

Dependence on the connectivity - We have presented in 
[l2l ] evidence that the currents scale as c _1 , so that after 
rescaling, the average energy c 2 (</>) (see Fig. 1 inset), the 
current distribution P(cy)/c, and the resource distribu- 
tion P(r) become principally dependent on the average 
capacity (A), and only weakly dependent on the connec- 
tivity c. The scaling property extends to the dynamics of 
the optimization process. All results of increasing con- 
nectivity approach those of the high connectivity limit 
derived in Section PVTl 

Here we further study how the high connectivity limit 
is approached from increasing finite values of c. We define 
the empirical scaling factor s by 

/ limc^oo c 2 {4>) , 

S =V — m — • (77) 

s is expected to converge to c in the high conenctivity 
limit. As shown in Fig. 1, the empirical scaling factor 
corresponding to different values of (A) collapse on a lin- 
ear curve with slope 1. The best fit is s w 1.02c — 0.43. 
It is remarkable that the network behavior already con- 
verges to scaling at low values of c. 

We make use of this empirical scaling factor to rescale 
the current distribution. The current distribution con- 
sists of a delta function component at y = (Fig. 1(b) 
inset |2l|) and a continuous component, whose breadth 
decreases with (A) . Excluding the delta function compo- 
nent, the continuous distribution after rescaling is shown 
in Fig. 1(b). The approach to the high connectivity 
limit is even faster when compared with that by setting 
the scaling factor to be c — 1 [12| • 

Energy dependence on node capacity - We divide the 
nodes into ten groups according to their node capacities. 
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FIG. 1: (color online) Results for N= 1000 and tj>(y) = y 2 /2. 
(a) The dependence of the empirical scaling factor s on the 
connectivity c for different values of (A). Line: best fit. Inset: 
c 2 (4>) as a function of (A). Symbols: c=3 (O), 4 (□), 5 
(0), 10 (A), high c (line), (b) The continuous component of 
the current distribution P(sy)/s for (A) = 0.02,0.5, 1. Lines: 
c = 3 (solid), 4 (dotted), 5 (dashed), 10 (dot-dashed), high c 
(long dashed). Inset: P(y = 0) as a function of (A), symbols: 
same as (a) inset. 

Nodes in group 1 have their capacities among the top 
10%, those in group 2 the next highest 10%, and so on. 
For each group, we then calculate the average energy per 
link for those links connected to the nodes of that group. 
The general trend can be seen in Figs. 2(a-b). Group 1 
consists of the richest nodes. Since they are the resource 
providers to the rest of the network, their connected links 
have high average energy. On the other hand, group 10 
consists of the poorest nodes. Since they are the resource 
consumers of the network, their connected links also have 
high average energy. Compared with group 1, their aver- 
age energy is even higher due to the enforcement of the 
resource constraints, Eq. {!]). By comparison, the groups 
in between consist of relay nodes which typically receive 
resources from the richer ones and provide resources to 
the poorer ones. The energies of their connected links 
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FIG. 2: (color online) Results for N = 1000 and tj>(y) = y 2 /2. 
(a) The rescaled energy per link connected to the ten groups 
of nodes with decreasing ranges of capacities at (A) = 0.02 
for different connectivities indicated in the legend, (b) Same 
as (a) but at (A) = 0.2. Inset: Comparison of the curves 
for (A}=0.02, 0.1, 0.2, 0.3 (top to bottom). Lines: high c. 
Symbols: c = 10 and (A)=0.02 (O), 0.1 (□), 0.2 (0), 0.3 
(A). 



have intermediate averages. Figs. 2(a-b) show that these 
different roles played by nodes of different capacities can 
lead to a significant difference in the average energies of 
their connected links. 

Furthermore, when one compares networks of different 
connectivities, one finds that the rescaled energy curves 
become only weakly dependent on the connectivity. The 
convergence to the high connectivity limit is rather fast. 

Fig. 3(b) inset shows the rescaled energy curves in the 
high connectivity limit for different (A). When (A) in- 
creases, a plateau starts to develop among the groups 
of richer nodes, indicating that the rich nodes become 
unsaturated in their resource provision, so that the en- 
ergy of their connected links become independent of their 
excess resources. They have A > Zi(£) according to 
Eq. (UHJ). Simulation results for c = 10 presented in the 
same figure provide support to this behavior. In fact, the 




FIG. 3: (color online) Results for system size A^ = 1000 and 
4>(y) = y 2 /2. The chemical potential distribution P(sfi)/s for 
(A) = 0.02, 0.5, 1. Lines: c = 3 (solid), 4 (dotted), 5 (dashed), 
10 (dot-dashed), high c (long dashed). Inset: P(/i = 0) as a 
function of (A). Symbols: c = 3 (O), 4 (□), 5 (0), 10 (A), 
high c (long dashed). 



development of this plateau is already visible in the sim- 
ulation results of finite connectivities in Fig. 3(b), whose 
trend shows that the homogenization of energy among 
the links connected to the rich nodes is increasingly ef- 
fective when the connectivity increases. 

Chemical potential distribution - Both the message- 
passing and price iteration algorithms allow us to study 
the distribution P(/i) of the chemical potentials /i. P(/x) 
consists of a delta function at \i = (Fig. 3 inset) and 
a continuous component. The width of the continuous 
component increases when (A) decreases. Note the con- 
currence of low average resource and highly negative val- 
ues of /i, consistent with the economic interpretation of 
\x as the storage cost of a node. The scaling property of 
the distribution is illustrated in Fig. [3] For (A) = 0.5 and 
1, the distributions collapse well even for relatively low 
values of c. For (A) = 0.02, a considerable dependence 
on c remains after rescaling. However, the approach to 
the high c limit is visible. 



VIII. GENERAL COST FUNCTIONS 

The cost used so far was the, rather simple, quadratic 
cost. In this section we examine the applicability of the 
message-passing algorithm for more general costs. Two 
representative costs will be studied: 

(a) The anharmonic cost function is used to model the 
effects of costs rising faster than quadratic 



<Kv) 



2 



(78) 



(b) The frictional cost function is used to model the 
effects of frictional forces in resource allocation, which 
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add an extra cost per unit current in a link irrespective 
of direction. Hence it is also useful in networks with 
multiple classes of traffic sharing the same links. The 
cost function takes the form 



<t>{y) = ~2 +v \y\ 



(79) 



Note that these cost functions represent two distinct 
cases. The former has well defined first and second 
derivatives for all of its arguments. In the latter case, the 
frictional cost function does not have a second derivative 
at y = 0. There is a kink in the cost function at the point 
of zero current, thus increasing the preference for idle 
links, or equivalently the reluctance to switch on a link. 
As will be shown, the convergence of the message-passing 
algorithm is much more difficult, and modifications are 
necessary. 



A. Anharmonic cost 

1. Price iteration 

Introducing Lagrange multipliers for the capacity con- 
straints, the function to be minimized is 




where Ci is the set of neighboring nodes of i. Optimizing 
with respect to yij = —yji, one obtains the relation 



Vij 



+ w l/u-j - Mil 



sgn(/ij - m). (81) 



Using the capacity constraints, the chemical potential [ii 
is given by jii — min(g~ 1 (0), 0), where is the inverse 
of the function 



■(*) = E - 



1 

- + U \Hj 



1 2 



sgnfjttj - x) + A,. 

This provides a price iteration scheme. We solve this 
equation using the bisection method, noting that the 
function is monotonic non-increasing. This requires one 
to know the solution bounds. Let /Li max and /x m i n be 
the maximum and minimum of the chemical potentials 
among the neighbors of node i. Examining the cases of 
Aj > and Aj < separately, one finds the range for the 
solution of x 



Mmin 

min < ^ ma> . 



c 



1 



u\Ai 



,0 
u |A 



< x < 



,0 



.(83) 



2. Message-passing 

Since the cost function has well defined first and second 
derivatives for all of its arguments, the message-passing 
algorithm formulated in section [V] is directly applicable: 



At 



Q(-Wi) 



E (B jk + <j/! k y 

k£Cj\{i} 



(84) 



(85) 



where 



fHj 



+Aj - Vij 



[yjk-iA^ + ^iB^ + ^y 1 ] 
Lke£ 3 \{i} 

-1 ^ 



lkec s \{i} 



,0 



(86) 



(87) 



and the backward message is given by 

Ajk + 4>' jk + Hij 
Vjk <— Vjk 75 ~TaTi • 

For the anharmonic cost function, 

<P'jk = Vik + uy] k sgn y jk , and 4>" k = 1 + 2u\y jk \ 

3. Simulation results 



To study the behavior of the various algorithms in the 
case the anharmonic cost function, in comparison to the 
quadratic cost, we carried out simulation under similar 
conditions to those of section IVIH 

Figure^a) shows the average energy per link as a func- 
tion of iteration steps of the price iteration algorithm for 
the anharmonic cost function. Figures H^b)-(d) show the 
distributions, P{y), P(v) and P(fi) of the currents, re- 
sources and chemical potentials respectively, at the cor- 
responding values of (A). The results obtained are very 
similar to those of the quadratic cost function and show 
the same qualitative behavior, as can be seen by com- 
paring Figs.QIa)-(d) with Figs.[]Ja) inset, 0(a), |2{b) and 
El[a) respectively. 

In Figs. 03a)-(d) we compare the behavior of the price 
iteration and message-passing algorithms by plotting the 
average energy per link, the fraction of idle links P(y = 
0), the fraction of saturated nodes P(r = 0), and the 
fraction of unsaturated nodes P(fi = 0) respectively, as 
a function of (A) for both algorithms at the anharmonic 
strengths u = 1 and 3. Both methods converge to the 
same value throughout the range examined and for the 
two u values examined. 

To provide a more quantitative comparison of the cost 
functions, we also plotted in Figs. G3b)-(d) P(y = 0), 
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FIG. 4: Results for N = 1000, c = 3, the anharmonic cost 
function with u = 1, and 1000 samples, (a) (0) obtained by 
the price iteration algorithm, utilizing Eqs. (|82[) and (|83(l . as 
a function of t for (A) =0.1, 0.2, 0.3, 0.4, 0.5 (top to bottom), 
(b) The corresponding current distribution P{y)- (c) The cor- 
responding resource distribution P(r). (d) The corresponding 
chemical potential distribution P(fi). 



P(r = 0) and P(/i = 0) respectively, for the quadratic 
(u = 0) cost function. Simulations have been carried out 
under the same conditions (N = 1000, c = 3 and 1000 
samples). It is remarkable that there is little difference 
between the quadratic and anharmonic cases. The dif- 
ferent cost functions merely change the continuous com- 
ponents of these distributions, leaving the delta function 
components effectively unchanged. 



B. Fractional cost 



FIG. 5: Results for N — 1000, c = 3, the anharmonic cost 
function, and 1000 samples, (a) Average energy per link {(f>). 
(b) The fraction of idle links P(y — 0). (c) The fraction of 
saturated nodes P(r = 0). (d) The fraction of unsaturated 
nodes P(n = 0). Symbols in (a)-(d): results obtained by 
the price iteration algorithm for the quadratic cost function 
(u — 0) (<), and the anharmonic cost function with u — 1 
(O) an d u = 3 (0); results obtained by the message-passing 
algorithm are also shown for u = 1 (□) and u — 3 (A). 



Since gi(x) is monotonic non- increasing, and piecewise 
linear, we have a fast way to solve the equation by finding 
the function at its 2c turning points, located at x — fij ± 
v. If <7j (Unun — v ) < 0, the solution is given by (fi m i n — 
v ) - M/Wn- v)/c; if 3i(/i m m- v) > 0, then among the 
turning points with gt{x) > 0, one finds the one with 
the minimum value of g%{x"), and the solution is given by 
x - g i (x)/g' i (x + ). 



1. Price iteration 



2. Message-passing 




Introducing Lagrange multipliers for the capacity con- 
straints, the function to be minimized is 



'' I'/-./ I ~ X^M X! Vv + A i 



Optimizing with respect to = — yji, one obtains 

Vij = [Vj -m-v sgn (fXj - m)] 9 - m\ - v] . (89) 

Using the capacity constraints, the chemical potential is 
given by /ij = min (g^~ (0),Q), where g~ 1 is the inverse 
of the function 

(90) 



Message-passing algorithms for the currents have not 
been successful in this case, presumably due to the di- 
vergence of the second derivative at y = 0. This in turn 
requires some form of regularization that causes the ef- 
fects of friction to be exhibited in the first, but not the 
second, derivative in finite systems. This inconsistency 
prevents the algorithm from converging. 

We present here an approach based on the chemical 
potential representation. To formulate an appropriate 
version of message-passing for this case, we return to the 
minimization of the energy of section |V1 namely, 



4*= E 

+v\y jk + £jk\ 



Aj k e jk + -B jk e 2 jk + ^(y jk + £]kf 



(91) 
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subject to the constraints 

(Vik + ejk) - Vij + Aj >0 , (92) 

keCj\{i} 

introduced by employing the Lagrange multipliers jiij. 
The optimal solution is given by /iy = min[^ 1 (0), 0], 
where g~j l is the inverse of the function 

9H ( x ) = H (! + B jk )- l [B jk y jh - A jk 
keCj\{i} 

-x-v sgn {B jk y jk - A jk - x)\ 

x 9 [\B jk y jk - A jk - x\ - v] + Ai - y^ . (93) 

The forward messages become 

Aij < Hij , (94) 

B < Q(-Mij) 

V E (1 + Bj k )~ l & [\B jk y 3k - A jk - Hij \-v]' 

feG£j\{i} 

To complete the algorithm, we need information- 
provision messages to determine the drawn current t/y 
at which the messages should be computed. Analogous 
to the case of quadratic cost functions, two methods are 
proposed. 

In the method of backward information-provision mes- 
sages, the backward messages are computed directly from 
the optimization of Eq. (|9ip and sent from node j to the 
descendent nodes, namely, 

Vjk «- (1 + B jk )~ x [Bj k y jk - Aj k - (jtij - v sga(B jk y jk 
-Aj k - fiij)]Q[\B jk y jk - Aj k - lMj\—v] . (95) 

This algorithm reduces the error at steady state to a level 
that is still rather high. A careful examination of the 
solution shows that the error is contributed by oscillatory 
solutions between yij and yji. Hence a learning rate rj is 
introduced: 

Vjk «— (1 - ij)Vik + ??(! + Bj k y x [Bj k yj k - A, jk 

-fiij - v sgn(Bj k y jk - A jk - fiij)] 

x 6 [\B ]k y jk - Aj k - [iij \-v}. (96) 

The case fj — 1 corresponds to the original algorithm. 

In the method of forward information-provision mes- 
sages, a node first receives the messages from the ancestor 
immediately before it updates its messages. The working 
point is obtained by minimizing the energy 

Eij = AijSij + -^11,, + Aj^—yij — e l j — yji) 

+v\vij+E iJ \, (97) 

with the optimal solution 

yij < (1 + Bij + Bji) 1 [Bijyij — A^ — Bjiyji + Aji 

-v sga(Bijyij - A iit - I! ,,</,, + Aji)] 

x 6 {\Bijyij ~ A^ - Bj^ + Aji \-v). (98) 



For further improvement, a learning rate is introduced, 
namely, 

yij *- (1 - v)Vij +v( l + B ij + IS,, :• 1 [B^yij ~ 
-Bjiy J2 + Aj t - v sgn(Bijyij - A^ - /»',,//,, + Aji)] 
x6 [\Bijyij - Aij - B ji y jl + Aji\ - v) . (99) 

3. Simulation results 

To study the behavior of both price iteration and mes- 
sage passing algorithms in the case of the frictional cost 
function, we carried out simulations under similar con- 
ditions to those of section IVIII Figure E)^a) shows the 
average energy per link as a function of iteration steps 
of the price iteration algorithm. Figures Ob), (c) and 
(d) show the current, resource, and chemical potential 
distributions, P(y), P(r), and P(/j), respectively for the 
various (A) values. 

The results shown in Figs. E{a)-(c) exhibit a similar 
qualitative behavior to those of the quadratic and an- 
harmonic cost functions. However, there is a substantial 
difference in the chemical potential distribution, shown in 
Fig.[(Jd) as a pseudogap develops in the range v < \i < 0, 
as well as a kink at /i = — 2v. From Eq. (|89p one notices 
that a link becomes idle when the potential difference at 
its vertices is less than v, accounting for the existence of 
the pseudogap. 

A quantitative comparison between the results ob- 
tained by price iteration ([90]) and message-passing ([94]) - 
(1951) algorithms (77 = 1, no learning rate) are presented 
in Fig.[7J A comparison of the average energy per link as 
a function of (A), the fraction of idle links P(y = 0), 
and the fraction of unsaturated nodes P(/i = 0) are 
shown in Figs. [TJa), (b) and (d), respectively, showing 
good agreement between the result obtained using both 
algorithms. Results obtained by both price iteration and 
message passing algorithms for a friction (v = 1) cost are 
also contrasted with results obtained for the quadratic 
(v = 0) cost in Figs. EKb) and (d). 

As shown in Fig. [71 the price iteration and the origi- 
nal message-passing algorithms yield results agreeing in 
the average energy (a), the fraction of idle links (b), and 
the fraction of unsaturated nodes(d). Compared with 
the quadratic cost function, the fraction of idle links is 
considerably increased after introducing the friction, as 
shown in Fig. [JJb). However, as shown in Fig. [Jjc), the 
message-passing algorithm gives values much lower than 
those of price iteration, and is inconsistent with the re- 
sults in Fig.Efd). 

The resource distribution in Fig. [5{a) explains the dis- 
crepancy. Compared with the results in Fig. [HJc), the 
sharp peak at r — is broadened to finite values of r. 
This shows that the original message-passing algorithm 
is not precise in computing the resources. Furthermore, 
the chemical potential distribution in Fig. [5Jb) exhibits 
rough features in the pseudogap, and the jumps near 
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FIG. 6: Results for N = 1000, c = 3, the frictional cost 
function with v = 1 and 1000 samples, (a) (<f>) obtained 
by the price iteration algorithm as a function of t for (A) = 
0.1,0.2,0.3,0.4,0.5 (top to bottom), (b) The corresponding 
current distribution P(y). (c) The corresponding resource 
distribution P(r). (d) The corresponding chemical potential 
distribution P(/j.). 



the edge of the pseudogap are less sharp than those in 
Fig.Hd). 

These unsatisfactory performances of the message- 
passing algorithm can be traced to its non-convergence. 
In message-passing, convergence is monitored by the 
root-mean-square average of ([(yij + Uji) /2] 2 ) 1 / 2 , which 
is expected to approach 0. As shown in Fig. [8jc) for 
the original algorithm (77 = 1), the convergence parame- 
ter reaches 0.04 at t = 500, compared with the value of 
0.0003 for the price iteration algorithm. 

To improve convergence, we introduce a learning rate 
according to Eqs. and As shown in Figs. [5Jc) 

and (d), convergence improves for decreasing r/, but 
is also slowed down. Comparing the two information- 
provision methods, the one using forward information- 
provision messages converges faster. 

As shown in Fig. OJa), better convergence is obtained 
by the forward information-provision messages in 500 
steps. Fig.[5Jb) summarizes the improvement in the frac- 
tion of saturated nodes on introducing the learning rate 
for 500 steps; results obtained using the price iteration 
algorithm are provided for comparison. Obviously, fur- 
ther improvement can be made by increasing the number 
of time steps, and hence depends on the amount of com- 
putational resource one wishes to commit. 

IX. CONCLUSION 

The paper presents a study of inference and optimiza- 
tion tasks of real value edges on sparse graphs under 



FIG. 7: Results for N = 1000, c = 3, the frictional cost func- 
tion with v — 1, and 1000 samples, (a) Average energy {<(>). 
(b) The fraction of idle links P(y — 0). (c) The fraction of 
unsaturated nodes P(/i = 0). (d) The fraction of saturated 
nodes P(r = 0). Symbols in (a)-(d): results obtained for the 
frictional cost function by the price iteration algorithm (O) 
and the message-passing algorithm (□); results obtained for 
the quadratic cost function (v = 0) (0). 



given constraints and cost measure. A generic frame- 
work comprising of sparsely connected nodes, represent- 
ing constraints, and edges representing current variables 
connecting them, is used as the basic framework for the 
inference and optimization tasks. Inference of real val- 
ues attributed to the graph edges has very rarely been 
studied before within and outside the statistical physics 
community. Although both theoretical analysis and al- 
gorithmic solutions can be obtained for any connectivity 
profile, we restricted this study to the case of fixed con- 
nectivity - c. 

The framework is analyzed using both the replica 
method and Bethe approximation to obtain a set of re- 
cursive equations to be solved numerically. The solutions 
provide numerical results for the free energy, average 
energy, and the distribution of currents, resources, and 
chemical potentials for the various cases. The recursive 
equations also enabled us to obtain scaling rules for var- 
ious quantities of interest as a function of the node con- 
nectivity. In addition, we have devised message-passing 
and price iteration algorithms for solving the optimiza- 
tion problem. The message-passing algorithm is based on 
passing first and second derivatives of the vertex free en- 
ergy, representing the local contribution to the system's 
free energy, thus saving the need to pass the full free en- 
ergy functions as messages. Despite the simplicity of the 
two-parameter messages, they yield exact solutions in the 
limit of sparse connectivity as long as they converge. 

Most numerical studies have been carried out for the 
case of quadratic cost that corresponds to the resource al- 
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FIG. 8: Results for N = 1000, c = 3, the frictional cost func- 
tion with v = 1, and 1000 samples, (a) The chemical po- 
tential distribution P(ji) for (A) =0.1, 0.2, 0.3, 0.4, 0.5 (top to 
bottom), (b) The corresponding resource distribution P(r). 
(c)-(d) The convergence parameter ([(yij + yji)/2[ 2 ) 1 ^ 2 of the 
message-passing algorithm as a function of iteration steps and 
for various r\ values, using backward and forward information- 
provision messages in (c) and (d), respectively. 




0.001 



(b) 




location problem which initiated this study. In this case 
we fixed the nodes capacities, representing biases in the 
local constraints, to quenched variables drawn from some 
Gaussian distribution of given mean (A) and unit vari- 
ance. Numerical results for various parameters values, c 
and (A) , show excellent agreement between the analytical 
and algorithmic approaches both for finite and asymp- 
totic connectivity values. Moreover, they expose an in- 
teresting percolation transition of the clusters of nodes 
with negative resources when (A) varies, giving rise to 
a slowing down of the convergence of the saddle point 
equations below a certain value of (A). 

To study the efficacy of our approach to other cost 
measures we have examined two different costs that in- 
clude anharmonic and friction terms. We have applied 
two different algorithms in these cases based on the price 
iteration and message-passing. Price iterations involve 
solving a nonlinear equation of the chemical potential 
at each step numerically On the other hand, message- 
passing involves updating the messages based on the 
working points estimated from the information-provision 
messages. While the obtained solutions are qualita- 
tively similar to that of the quadratic cost, there are 
also substantial algorithmic and conceptual differences, 
especially in the case of friction cost. For the optimiza- 
tion task studied in this paper, price iteration is simpler 
in implementation and converges better when compared 
with message-passing. However, for future extensions to 
inference problems at finite temperatures, we expect the 
message-passing approach to be more appropriate. It is 



0.5 1 ■ ■ 

0.01 0.1 1 

FIG. 9: The convergence parameter {[(yij + yji)/2] 2 ) 1 '' 2 of 
message-passing algorithms as a function of the learning rate 
r\ at t = 500 with N — 1000, c = 3, the frictional cost func- 
tion with v = 1, and 1000 samples. Symbols: information- 
provision using backward messages (O) an d forward messages 
(□). (b) The corresponding fraction of saturated nodes. Sym- 
bols: same as in (a), and the price iteration algorithm (dashed 
line) . 



also useful to adopt an adpative learning rate as a func- 
tion of time to optimize the performance (24j . 

We believe this research opens a rich area for further in- 
vestigations with many potential applications, especially 
when additional restrictions are imposed and other costs 
considered. More specifically, one may consider band- 
width limited links |23J and other nonlinear costs which 
are of interest in realistic networks. We expect that many 
nonlinear costs may exhibit replica symmetry-breaking 
effects, and it would be interesting to consider how the 
analyses and algorithms should be modified to cope with 
these effects. 
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DAG04/05.SC25 and DAG05/06.SC36) and EVER- and the conjugate parameters Q™ 8 . The replicated and 
GROW, IP No. 1935 in FP6 of the EU. averaged partition function {Z n } becomes 



APPENDIX A: REPLICA APPROACH TO 
NETWORK OPTIMIZATION 

To calculate the averaged replicated partition func- 
tion ([5]) , we employ an integral representation of the step 
function to obtain 



Ay =0,1 i 



n / dMAon / d < 1 . dx ? I 



A, 



xe 



exp 



(Al) 



(id) a 



Collecting terms containing Aij and summing over them, 
one obtains 

^>=^n/^n[/^(A,) 

z * 1 L 

x II/ d < dX ? J ^e^XSW 
1 + z iZj exp [J] (iA? - iAp« - ^) 



><n 



(A2) 



This includes a mixed term of i and j indices. An ad- 
ditional expansion is required to disentangle the two in- 
dices. The product over (ij) can be written as an expo- 
nential function whose argument is 



EE 

ij m—1 



\m-l 



2m 



-2 ~j 



exp E n 



expf^imA>fJ (A3) 
(-im\?vf) r ° 



r a ,s a ,t a a 



which gives rise to the mean-field parameters 

QT, S = ^== E *T ex P (E imX >i ) 

xn(-^Af)^«)^, (A4) 



2n /cN 



\ r,s,m / i L * 



-Ai 



dA? 



m ~ 1 Q^ s 
2m n r a !s Q ! 



x expj VcA £ Q™ £ exp ( £ imA>f J 

^ r,s,m z \ a / 

x nH™*f ) r °(«?)* a ( :i 

xE^ ex p(E^x) IIK 

j \ a /a 



(A5) 



The integration over Zi is dominated by the term m = 1 
in the c" 1 order expansion of the exponential term that 
leads to Eq. ([7]). Both Q T S and Q r ,s are then given by 
the saddle point equations 



Ni - N 2 

Qr.s = -jy and Q r!S = — where 

Ni = J dA P (A)]J J dv a J°°J\ a J ^ 



x exp 

x[[Hir°K) s ° , 

CK 

N 2 = \ J dAp(A) JJ | |°° dA Q | ^ 



x exp 



^ ( «A Q (A Q + cv a ) - ^-(v a ) 2 



X 



c-l 



n 



d = 1 dAp(A) n y ^ |°° d\ a j ^ 



x exp 



X c ,(A6) 
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where X is given by Eq. ((SJ. By virtue of the saddle 
point equations (|A6|) . one can show that 



^r,s - „ 2^11 r U !(-„ _ + \l w ! 
t,u a v 7 



dy 



t a +u a 



Q 



s— t,r+u 



(A7) 



Exploiting the even nature of 4>{y) and relation (|A7|) [20| , 
the expressions for X and Q r s reduce to 



Qr.s 



-i\ a - 



dy 



y=v a 

a 

(A8) 

To better understand the symmetry properties of the 
order parameters, we consider the generating function 
P s (z) and its inversion in Eq. ([9]). Substituting Eq. (|A8D 
into Eq. we reproduce Eq. (TIT)]) , with Dp being 

Dp = j dAp(A) 11 J dv a p d\ a J ^ 



\ vx\> j i\ a {X a + CVa) - ^{Vaf 



ka 



dy 



En 

Sfc fc=l 

P<Kv) 

y=v a 



(A9) 



Once we have represented the order parameters Q r , s 
using the generating function P s (z), we can make ex- 
plicit assumptions about their symmetry properties. In 
particular, in the replica symmetric ansatz, we consider 
functions of the form Eq. (TTT]) . 

Notice that the replicas in Eq. (fTTj) are coupled through 
their common dependence on the disordered distribution 
of A. This is different from the SK model, in which 
the dependence on the disorder is integrated out, and 
the interaction between the replicas is explicit. Using 
the ansatz (fTT|) . the recursion relation for P s (z) can be 
replaced by a recursion relation for the function R in 
Eq. Ull), where 



k=l 



Dr= \\j 

x6 



dv k R(v, v k \T k 



At 



(A10) 



x exp 



fe=i 



Ay is the capacity of the vertex fed by c trees T l5 



Letting y = v — z, we consider solutions of Eq. (TT2|) in 
the form 



R(z,u\T) = W(u)Z v (y\T). 



(All) 



Separating the dependence on the current potentials from 
that on the currents, the extra Gaussian distribution of v 
in Eq. (|A10[) prevents the integration of v from diverging. 
Indeed, in the n — > limit and as e — > 0, the function 
W(z) becomes independent of z and can be represented 
as 



V Z7T 



(A12) 



The recursion relation involving the currents becomes de- 
coupled to give 



C-l r « 

Z v (y\T) - II / dy k Z v (y k \T k ) 

k=l 

x6> E yk ~ y + Ay ( T ) exp 



Vfe=l 



fe=i 



exp|-/ln|f[ J dy k Z v (y k \T 



(A13) 



:e E^ fc + Ay exp 



\fc=l 



k=l 



Let _Fy(y|T) be the vertex free energy when a cur- 
rent y is drawn from the vertex of a tree T, given by 
Fy(y\T) = —T\nZy(y\T). Then the recursion relation 
of the free energy is given by Eq. (|13[) , which in the zero- 
temperature limit becomes Eq. (fT4)l . 

To calculate the free energy in the replica approach, 
one returns to Eq. {7J ■ In the second term of the exponen- 
tial argument therein, one eliminates Q r , s by Eq. (|A7|) . 
expresses Q r , s in terms of P s (z>) by Eq. ([9]) and, in turn, 
R(z a , i/\A) by Eq. (jTTJ) . In the third term, one expresses 
X in terms of <3r,s by Eq. (|A8j> and follow similar steps. 
The result is 

xE^^lT^expb^i-^)]} ) (A14) 



In 1 



fc=i 



dv k R(y, v k \T k ) 
/3e 



\fe=i 



— ciH- A j exp ( —(3 $(y — v k ) 



Using the recursion relation Eq. (|12p , one can show that 
the sum of the first two terms in the exponential argu- 
ment vanishes. In the limit n — > one obtains the free 
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FIG. 10: A bipartite graph representation of the resource 
allocation problem with the current variables y on the left 
and the interaction variables Z on the right. 



energy 

{(3F)=-N{\n\ [dvf[[R(v,v k \T k )]e(j2 

\ ly k=l \k=l 



-cv + A exp 



k=l 



(A15) 



Using the vertex free energy representation, one then 
straightforwardly rewrites Eq. (|A15|) as Eq. IT5|) (up to 
a constant). 



where Q{yij) is the posterior of yy given Zj, P{yij) is 
the prior of yy, and R(yij) is the likelihood of Zj given 
Uij . As shown in Fig. \W[ the message from Zj to yy is 
Q(yij), and the message from y^ to Zi is R{yij). Thus, 

R (Vij) = I II d %fe I ?/ij>{%fc :ke Cj\ i}) 

J fce£,)\j 

x J] Q(Vj*) ■ (B2) 



k£Cj\i 



Using 



P(yij) cx exp ( -~<t>{yij) 



(B3) 



(B4) 



and substituting the expression for Q (|B1|) into the P 
equation (|B2[) one obtains 



I Vij,{Vjk ■ k G 

OC 6 Aj - J/y + ^ J/jfc 
V ke£j\i 



x JJ exp f --<f>(yjk) ) R{yjk) 

k£Cj\i 



(B5) 



APPENDIX B: MESSAGES IN THE BAYESIAN 
APPROXIMATION 

To show that the vertex free energies are directly re- 
lated to passed messages in the Bayesian approximation, 
one resorts to formulating the problem on a bipartite 
graph and deriving the closed set of equations that re- 
late to the messages passed from variables to interaction 
nodes and vice versa. 

The representation of the problem as a bipartite graph 
is shown in Fig. 1101 with the current variables y on the 
left and the interaction variables Z on the right. Using 
conventional notations [3| one can easily derive the closed 
set of equations: 



Let Fv{yij\Zj) = —T\n.R(yij). Then on taking the log- 
arithm of both sides of Eq. (|B5[) and normalizing, one 
retrieves Eq. (fT5| if Fy(yij \Zj) is identified with the ver- 
tex free energy F v (yij\Tj), 



, Ujk Uij 
TP 

± av • 

(B6) 



F v ( yij \Zj) = - Tln |il (/ d y^) (E: 
-0Y,(Mvik\Zk) + <Kvjk) 



+Aj exp 



Q(yij) oc P(yij) R(yij) 



(Bl) 



This means that the vertex free energy Fv(yij\Tj) is 
equivalent to —T times the logarithm of the message 
R(yij) from yij to Zi. 
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